@losipiuk
Member

@losipiuk losipiuk commented Sep 17, 2025

Description

Additional context and related issues

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

@cla-bot cla-bot bot added the cla-signed label Sep 17, 2025
@losipiuk losipiuk force-pushed the lukaszos/use-pre-order-stages-ordering-in-f26809 branch from 7b5b5de to b94e6fc Compare September 18, 2025 11:29
@github-actions

github-actions bot commented Oct 9, 2025

This pull request has gone a while without any activity. Ask for help on #core-dev on Trino slack.

@github-actions github-actions bot added the stale label Oct 9, 2025
@wendigo
Contributor

wendigo commented Oct 9, 2025

@losipiuk ready for review?

@github-actions github-actions bot removed the stale label Oct 10, 2025
@losipiuk losipiuk force-pushed the lukaszos/use-pre-order-stages-ordering-in-f26809 branch from b94e6fc to 605de60 Compare October 13, 2025 08:28
Rename getSubStagesDeepPreOrder to getSubStagesDeep.
Call spots did not really depend on any specific ordering of returned
stages.
@losipiuk losipiuk force-pushed the lukaszos/use-pre-order-stages-ordering-in-f26809 branch from 605de60 to 0782d71 Compare October 13, 2025 08:38
@losipiuk losipiuk requested a review from kasiafi October 13, 2025 08:38
@losipiuk losipiuk marked this pull request as ready for review October 13, 2025 08:38
@sourcery-ai

sourcery-ai bot commented Oct 13, 2025

Reviewer's Guide

This PR refactors StagesInfo, replacing the specialized pre-order and post-order traversals with two methods: getSubStagesDeep (pre-order) and getSubStagesDeepTopological (topological). It also updates the consumer code (ExplainAnalyzeOperator, QueryMonitor, and PlanPrinter) to use the new APIs, so that EXPLAIN output uses a consistent pre-order stage ordering.
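To make the difference between the two orderings concrete, here is a minimal sketch over a simplified stage graph. It is not Trino's StagesInfo: the class name `StageTraversalSketch`, the `Map<Integer, List<Integer>>` graph model, and both method names are assumptions made for illustration only. It assumes a sub-stage may be reachable from more than one parent, which is the case where the two orderings diverge.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a stage graph: stage id -> ids of its direct sub-stages.
// This is NOT Trino's StagesInfo; all names here are illustrative assumptions.
final class StageTraversalSketch
{
    private StageTraversalSketch() {}

    // Pre-order: a stage is emitted when first reached, before its sub-stages.
    static List<Integer> preOrder(Map<Integer, List<Integer>> subStages, int root)
    {
        List<Integer> result = new ArrayList<>();
        collectPreOrder(subStages, root, new HashSet<>(), result);
        return result;
    }

    private static void collectPreOrder(Map<Integer, List<Integer>> subStages, int stage, Set<Integer> visited, List<Integer> result)
    {
        if (!visited.add(stage)) {
            return; // already emitted via another parent
        }
        result.add(stage);
        for (int child : subStages.getOrDefault(stage, List.of())) {
            collectPreOrder(subStages, child, visited, result);
        }
    }

    // Topological: every stage appears before all of its (transitive) sub-stages.
    // Built here as a reversed depth-first post-order.
    static List<Integer> topological(Map<Integer, List<Integer>> subStages, int root)
    {
        List<Integer> postOrder = new ArrayList<>();
        collectPostOrder(subStages, root, new HashSet<>(), postOrder);
        Collections.reverse(postOrder);
        return postOrder;
    }

    private static void collectPostOrder(Map<Integer, List<Integer>> subStages, int stage, Set<Integer> visited, List<Integer> result)
    {
        if (!visited.add(stage)) {
            return;
        }
        for (int child : subStages.getOrDefault(stage, List.of())) {
            collectPostOrder(subStages, child, visited, result);
        }
        result.add(stage); // emitted only after all sub-stages
    }
}
```

For a graph where stage 0 reads from stages 1 and 2, and both of those read from stage 3, `preOrder` yields 0, 1, 3, 2 while `topological` yields 0, 2, 1, 3. Only the second list places stage 3 after every stage that consumes it.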

Sequence diagram for finding failed tasks in QueryMonitor using new traversal

sequenceDiagram
    participant QueryMonitor
    participant StagesInfo
    participant StageInfo
    participant TaskInfo
    participant TaskStatus
    QueryMonitor->>StagesInfo: getSubStagesDeep(outputStageId, true)
    loop for each StageInfo
        StagesInfo-->>QueryMonitor: StageInfo
        QueryMonitor->>StageInfo: getTasks()
        loop for each TaskInfo
            StageInfo-->>QueryMonitor: TaskInfo
            QueryMonitor->>TaskInfo: taskStatus()
            TaskInfo-->>QueryMonitor: TaskStatus
            QueryMonitor->>TaskStatus: getState()
            TaskStatus-->>QueryMonitor: TaskState
        end
    end
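The nested loops in the diagram can be sketched in Java as follows. The types `Stage`, `Task`, and `TaskState`, and the method `findFailedTask` as written here, are simplified, hypothetical stand-ins rather than the actual QueryMonitor code. The sketch only shows that the first failed task found depends on the order in which stages are supplied.

```java
import java.util.List;
import java.util.Optional;

// Simplified, hypothetical model of the loop in the diagram above.
// These types are NOT Trino's StageInfo/TaskInfo/TaskStatus.
final class FailedTaskSketch
{
    private FailedTaskSketch() {}

    enum TaskState { RUNNING, FINISHED, FAILED }

    record Task(String id, TaskState state) {}

    record Stage(int id, List<Task> tasks) {}

    // Returns the first FAILED task encountered, scanning stages in the given order.
    static Optional<Task> findFailedTask(List<Stage> stagesInTraversalOrder)
    {
        for (Stage stage : stagesInTraversalOrder) {
            for (Task task : stage.tasks()) {
                if (task.state() == TaskState.FAILED) {
                    return Optional.of(task);
                }
            }
        }
        return Optional.empty();
    }
}
```

If both a root stage and a leaf stage contain a failed task, passing the stages root-first returns the root's task, and passing them leaf-first returns the leaf's task. The traversal order is therefore part of the method's observable behavior whenever more than one task has failed.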

File-Level Changes

Change: Refactor stage traversal in StagesInfo to unified pre-order and topological methods
Details:
  • Rename getSubStagesDeepPreOrder to getSubStagesDeep and adjust its overload
  • Rename getSubStagesDeepPostOrder to getSubStagesDeepTopological and implement it using a builder, visited set, and reverse logic
  • Remove or update old collectSubStageIdsPreOrder/PostOrder methods and fix their generic types
Files:
  • core/trino-main/src/main/java/io/trino/execution/StagesInfo.java

Change: Update consumers to call new traversal methods
Details:
  • Replace getSubStagesDeepPreOrder/getSubStagesDeepPostOrder calls with getSubStagesDeep/getSubStagesDeepTopological
  • Extract StagesInfo into a local variable in ExplainAnalyzeOperator and update state checks
  • Adjust textDistributedPlan in PlanPrinter and failure detection in QueryMonitor to use topological ordering
Files:
  • core/trino-main/src/main/java/io/trino/operator/ExplainAnalyzeOperator.java
  • core/trino-main/src/main/java/io/trino/event/QueryMonitor.java
  • core/trino-main/src/main/java/io/trino/sql/planner/planprinter/PlanPrinter.java

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • In findFailedTask you’ve switched from post-order to pre-order traversal, which may report a different failed task than before—double-check that this still aligns with the intended deepest-first behavior.
  • The private helper collectSubStageIdsPreOrder is now used for generic deep traversal but still named “PreOrder”—consider renaming it (and its parameters) to better reflect its updated purpose.
  • The topological traversal builds the list then calls reverse; you could simplify by adding elements in reverse order during the walk to avoid the extra reverse step.
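The last suggestion above (emit elements in reverse during the walk instead of reversing afterwards) could look roughly like the sketch below. It uses the same simplified, hypothetical graph model of stage id to sub-stage ids. The class and method names are assumptions, and this is not the code from StagesInfo.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: topological ordering without a separate reverse step.
// The graph model (stage id -> sub-stage ids) is an illustrative assumption.
final class TopologicalWithoutReverseSketch
{
    private TopologicalWithoutReverseSketch() {}

    static List<Integer> topological(Map<Integer, List<Integer>> subStages, int root)
    {
        Deque<Integer> ordered = new ArrayDeque<>();
        visit(subStages, root, new HashSet<>(), ordered);
        return new ArrayList<>(ordered); // copies in head-to-tail order
    }

    private static void visit(Map<Integer, List<Integer>> subStages, int stage, Set<Integer> visited, Deque<Integer> ordered)
    {
        if (!visited.add(stage)) {
            return;
        }
        for (int child : subStages.getOrDefault(stage, List.of())) {
            visit(subStages, child, visited, ordered);
        }
        // Prepending on completion yields reversed post-order directly.
        ordered.addFirst(stage);
    }
}
```

Whether this is actually simpler than building a list and reversing it is a judgment call. The deque version trades one `reverse` call for a final copy into a list.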

core/trino-main/src/main/java/io/trino/execution/StagesInfo.java, line 136

-        collectSubStageIdsPostOrder(stageInfo, subStagesIds);
-        if (includeRoot) {
-            subStagesIds.add(root);
+        if (visitedFragments.contains(stageId)) {

question: Cycle detection is introduced but not documented.

Using 'visitedFragments' avoids infinite recursion, but if cycles are unexpected, this could hide data problems. Consider raising an error or warning when a cycle is detected.
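One way to act on this suggestion is to distinguish a stage that was already fully processed (a legitimately shared sub-stage) from a stage that is still on the current traversal path (a real cycle). The sketch below does that with two sets. It is a hypothetical illustration on a simplified graph model, and none of the names come from StagesInfo.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: skip shared sub-stages silently, but fail loudly on a true cycle.
// The graph model (stage id -> sub-stage ids) is an illustrative assumption.
final class CycleCheckSketch
{
    private CycleCheckSketch() {}

    static List<Integer> collect(Map<Integer, List<Integer>> subStages, int root)
    {
        List<Integer> result = new ArrayList<>();
        visit(subStages, root, new HashSet<>(), new HashSet<>(), result);
        return result;
    }

    private static void visit(Map<Integer, List<Integer>> subStages, int stage, Set<Integer> inProgress, Set<Integer> done, List<Integer> result)
    {
        if (done.contains(stage)) {
            return; // reached again through another parent: fine in a DAG
        }
        if (!inProgress.add(stage)) {
            // The stage is an ancestor of itself on the current path.
            throw new IllegalStateException("Cycle detected at stage " + stage);
        }
        result.add(stage);
        for (int child : subStages.getOrDefault(stage, List.of())) {
            visit(subStages, child, inProgress, done, result);
        }
        inProgress.remove(stage);
        done.add(stage);
    }
}
```

A single visited set cannot tell these two situations apart, which is why it can mask a malformed graph. The extra set costs little and turns an unexpected cycle into an explicit failure.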

@losipiuk losipiuk merged commit 928fd0d into trinodb:master Oct 13, 2025
98 of 100 checks passed
@github-actions github-actions bot added this to the 478 milestone Oct 13, 2025
@ebyhr
Copy link
Member

ebyhr commented Oct 14, 2025

@losipiuk Could you update "Release notes" section?

@losipiuk losipiuk added the no-release-notes This pull request does not require release notes entry label Oct 14, 2025

Labels

cla-signed no-release-notes This pull request does not require release notes entry

5 participants